Plagiarism and authorship analysis: introduction to the special issue

Authors

  • Efstathios Stamatatos
  • Moshe Koppel
Abstract

The Internet has facilitated both the dissemination of anonymous texts and the easy “borrowing” of the ideas and words of others. This has raised a number of important questions regarding authorship. Can we identify the anonymous author of a text by comparing the text with the writings of known authors? Can we determine whether a text, or parts of it, has been plagiarized? Such questions are clearly of both academic and commercial importance.

The task of determining or verifying the authorship of an anonymous text based solely on internal evidence is a very old one, dating back at least to the medieval scholastics, for whom the reliable attribution of a given text to a known ancient authority was essential to determining the text’s veracity. More recently, the problem of authorship attribution has gained greater prominence due to new applications in forensic analysis, humanities scholarship, and electronic commerce, and to the development of computational methods for addressing the problem.

Over the last century and more, a great variety of methods have been applied to authorship attribution problems of various sorts. One can roughly trace the evolution of methods through three main stages. In the earliest stage, researchers sought a single numeric function of a text to discriminate between authors. In a later stage, statistical multivariate discriminant analysis was applied to word frequencies and related numerical features. Most recently, machine learning methods and high-dimensional textual features have been applied to sets of training documents to ...

Similar resources

An introduction to the examples of scientific plagiarism and its identification soft-wares

Background: Increasing immorality and plagiarism in the country's higher education system have become a serious crisis. Hence, the purpose of this study was to analyze examples of plagiarism and to introduce plagiarism detection software. Method: The present study is a narrative review study. Articles in Persian and Latin related to the use of scientific theft key words in databases w...

A Survey on Authorship Analysis

The paper discusses the problem of authorship analysis and its different types, such as authorship attribution, authorship identification, authorship profiling, and plagiarism detection. It also addresses the issues in Indian language text. Keywords: authorship attribution, authorship profiling, plagiarism detection, text classification.

Pathology Analysis of Plagiarism: A Qualitative Research

Introduction: Today, with the development of universities, which are responsible for the scientific and ethical training of the educated generation, plagiarism is not limited to particular people, and ignoring it, especially in academia, will bring terrible consequences. Thus, it is necessary to discover the reasons for plagiarism in order to prevent its growth. This study aimed to explore experienc...

Authorship and Plagiarism Detection Using Binary BOW Features

Identifying writing style shifts and variations is a fundamental capability when addressing authorship-related tasks. In this work we examine a simplified approach for unsupervised authorship and plagiarism detection which is based on a binary bag-of-words representation. We evaluate our approach using PAN-2012 Authorship Attribution challenge data, which includes both open/closed class authorsh...

A Fuzzy Logic Approach to Computer Software Source Code Authorship Analysis

Software source code authorship analysis has become an important area in recent years with promising applications in both the legal sector (such as proof of ownership and software forensics) and the education sector (such as plagiarism detection and assessing style). Authorship analysis encompasses the sub-areas of author discrimination, author characterization, and similarity detection (also r...

Journal title:
  • Language Resources and Evaluation

Volume 45

Publication date: 2011